Overview

Dataset statistics

Number of variables: 40
Number of observations: 356105
Missing cells: 720038
Missing cells (%): 5.1%
Duplicate rows: 90
Duplicate rows (%): < 0.1%
Total size in memory: 744.2 MiB
Average record size in memory: 2.1 KiB
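
The derived figures in the dataset statistics can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is total memory divided by the number of observations. The variable names are illustrative and are not part of the report.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 40
n_observations = 356_105
missing_cells = 720_038
total_memory_mib = 744.2

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows (MiB -> KiB).
avg_record_kib = total_memory_mib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Under these assumptions both values round to the figures shown in the report.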

Variable types

Text: 1
Categorical: 11
Numeric: 6
Boolean: 22

Alerts

Dataset has 90 (< 0.1%) duplicate rows [Duplicates]
HadHeartAttack is highly imbalanced (68.6%) [Imbalance]
HadAngina is highly imbalanced (67.2%) [Imbalance]
HadStroke is highly imbalanced (74.1%) [Imbalance]
HadSkinCancer is highly imbalanced (59.7%) [Imbalance]
HadCOPD is highly imbalanced (59.6%) [Imbalance]
HadKidneyDisease is highly imbalanced (73.1%) [Imbalance]
HadDiabetes is highly imbalanced (59.9%) [Imbalance]
DeafOrHardOfHearing is highly imbalanced (55.8%) [Imbalance]
BlindOrVisionDifficulty is highly imbalanced (69.0%) [Imbalance]
DifficultyDressingBathing is highly imbalanced (75.8%) [Imbalance]
DifficultyErrands is highly imbalanced (60.8%) [Imbalance]
HighRiskLastYear is highly imbalanced (74.2%) [Imbalance]
PhysicalHealthDays has 8691 (2.4%) missing values [Missing]
MentalHealthDays has 7257 (2.0%) missing values [Missing]
LastCheckupTime has 6597 (1.9%) missing values [Missing]
SleepHours has 4349 (1.2%) missing values [Missing]
RemovedTeeth has 9042 (2.5%) missing values [Missing]
DeafOrHardOfHearing has 16402 (4.6%) missing values [Missing]
BlindOrVisionDifficulty has 17143 (4.8%) missing values [Missing]
DifficultyConcentrating has 19276 (5.4%) missing values [Missing]
DifficultyWalking has 19070 (5.4%) missing values [Missing]
DifficultyDressingBathing has 19012 (5.3%) missing values [Missing]
DifficultyErrands has 20405 (5.7%) missing values [Missing]
SmokerStatus has 28274 (7.9%) missing values [Missing]
ECigaretteUsage has 28412 (8.0%) missing values [Missing]
ChestScan has 44791 (12.6%) missing values [Missing]
RaceEthnicityCategory has 11276 (3.2%) missing values [Missing]
AgeCategory has 7234 (2.0%) missing values [Missing]
HeightInMeters has 22742 (6.4%) missing values [Missing]
WeightInKilograms has 33567 (9.4%) missing values [Missing]
BMI has 38866 (10.9%) missing values [Missing]
AlcoholDrinkers has 37137 (10.4%) missing values [Missing]
HIVTesting has 52878 (14.8%) missing values [Missing]
FluVaxLast12 has 37622 (10.6%) missing values [Missing]
PneumoVaxEver has 61588 (17.3%) missing values [Missing]
TetanusLast10Tdap has 65972 (18.5%) missing values [Missing]
HighRiskLastYear has 40437 (11.4%) missing values [Missing]
CovidPos has 40556 (11.4%) missing values [Missing]
PhysicalHealthDays has 214245 (60.2%) zeros [Zeros]
MentalHealthDays has 212324 (59.6%) zeros [Zeros]
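
The imbalance percentages in these alerts appear consistent with one minus the class entropy, normalized by the logarithm of the number of classes and computed over non-missing values. That formula is an assumption inferred from the counts reported for HadHeartAttack and HadDiabetes in the Variables section; it is not quoted from the ydata-profiling source. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k) for k class counts (0 = balanced, 1 = one class only).

    Assumption: missing values are excluded before counting.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)

# Counts taken from the Variables section of this report.
heart_attack = imbalance_score([333_575, 20_057])            # False, True
diabetes = imbalance_score([294_997, 48_871, 8_257, 3_115])  # four categories

print(f"HadHeartAttack: {100 * heart_attack:.1f}%")
print(f"HadDiabetes: {100 * diabetes:.1f}%")
```

A score near 0% would indicate evenly populated classes; the flagged variables all sit well above 50%.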

Reproduction

Analysis started: 2024-07-09 22:22:34.772788
Analysis finished: 2024-07-09 22:23:00.127626
Duration: 25.35 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
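
The reported duration follows from the two timestamps. A minimal check using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-07-09 22:22:34.772788", FMT)
finished = datetime.strptime("2024-07-09 22:23:00.127626", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```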

Variables

State
Text

Distinct: 54
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.2 MiB

Length

Max length: 20
Median length: 12
Mean length: 8.350919
Min length: 4

Characters and Unicode

Total characters: 2973804
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Utah
2nd row: District of Columbia
3rd row: Washington
4th row: Wisconsin
5th row: Kansas

Value | Count | Frequency (%)
new | 29957 | 7.0%
washington | 20900 | 4.9%
york | 14289 | 3.4%
south | 13960 | 3.3%
minnesota | 13495 | 3.2%
ohio | 13186 | 3.1%
maryland | 13117 | 3.1%
virginia | 12400 | 2.9%
carolina | 11617 | 2.7%
texas | 11362 | 2.7%
Other values (50) | 272236 | 63.8%
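
The table above counts individual lowercased words rather than whole state names, which is why "new" and "york" appear as separate entries. The sketch below reproduces that kind of count on the five sample rows. It assumes plain whitespace tokenization after lowercasing, which may differ from the tokenizer the profiler actually uses; `word_counts` is a hypothetical helper.

```python
from collections import Counter

# The five sample rows listed for the State variable.
sample_states = ["Utah", "District of Columbia", "Washington", "Wisconsin", "Kansas"]

def word_counts(values):
    """Count lowercased, whitespace-separated words across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

counts = word_counts(sample_states)
for word, count in counts.most_common():
    print(word, count)
```

Multi-word names such as "District of Columbia" contribute one count to each of their words, so word totals can exceed the number of rows.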

Most occurring characters

Value | Count | Frequency (%)
a | 383478 | 12.9%
i | 282987 | 9.5%
n | 266958 | 9.0%
o | 251579 | 8.5%
s | 207343 | 7.0%
e | 173280 | 5.8%
r | 151899 | 5.1%
t | 135382 | 4.6%
h | 99402 | 3.3%
l | 86420 | 2.9%
Other values (36) | 935076 | 31.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2973804 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2973804 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2973804 | 100.0%

Sex
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.1 MiB
Female: 188765
Male: 167340

Length

Max length: 6
Median length: 6
Mean length: 5.0601648
Min length: 4

Characters and Unicode

Total characters: 1801950
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
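
The length and character totals for a categorical variable follow directly from its value counts. A small sketch using the two Sex categories, assuming the mean length is total characters divided by the number of non-missing values:

```python
# Value counts reported for the Sex variable.
value_counts = {"Female": 188_765, "Male": 167_340}

# Each value contributes len(value) characters per occurrence.
total_characters = sum(len(value) * count for value, count in value_counts.items())
n_values = sum(value_counts.values())
mean_length = total_characters / n_values

print(total_characters)
print(f"{mean_length:.7f}")
```

The same arithmetic applies to the other categorical variables, with missing rows left out of the denominator.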

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Male
4th row: Female
5th row: Male

Common Values

Value | Count | Frequency (%)
Female | 188765 | 53.0%
Male | 167340 | 47.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
female | 188765 | 53.0%
male | 167340 | 47.0%

Most occurring characters

Value | Count | Frequency (%)
e | 544870 | 30.2%
a | 356105 | 19.8%
l | 356105 | 19.8%
F | 188765 | 10.5%
m | 188765 | 10.5%
M | 167340 | 9.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1801950 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1801950 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1801950 | 100.0%

GeneralHealth
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 937
Missing (%): 0.3%
Memory size: 21.6 MiB
Very good: 118824
Good: 114901
Excellent: 57393
Fair: 48234
Poor: 15816

Length

Max length: 9
Median length: 4
Mean length: 6.4807556
Min length: 4

Characters and Unicode

Total characters: 2301757
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Very good
2nd row: Fair
3rd row: Good
4th row: Good
5th row: Excellent

Common Values

Value | Count | Frequency (%)
Very good | 118824 | 33.4%
Good | 114901 | 32.3%
Excellent | 57393 | 16.1%
Fair | 48234 | 13.5%
Poor | 15816 | 4.4%
(Missing) | 937 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
good | 233725 | 49.3%
very | 118824 | 25.1%
excellent | 57393 | 12.1%
fair | 48234 | 10.2%
poor | 15816 | 3.3%

Most occurring characters

Value | Count | Frequency (%)
o | 499082 | 21.7%
d | 233725 | 10.2%
e | 233610 | 10.1%
r | 182874 | 7.9%
V | 118824 | 5.2%
y | 118824 | 5.2%
(space) | 118824 | 5.2%
g | 118824 | 5.2%
G | 114901 | 5.0%
l | 114786 | 5.0%
Other values (9) | 447483 | 19.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2301757 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2301757 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2301757 | 100.0%

PhysicalHealthDays
Real number (ℝ)

MISSING  ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 8691
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3530456
Minimum: 0
Maximum: 30
Zeros: 214245
Zeros (%): 60.2%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3
95-th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 8.6955881
Coefficient of variation (CV): 1.9975872
Kurtosis: 3.4163046
Mean: 4.3530456
Median Absolute Deviation (MAD): 0
Skewness: 2.1776631
Sum: 1512309
Variance: 75.613253
Monotonicity: Not monotonic
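
Several of the descriptive statistics for PhysicalHealthDays are linked by simple identities, which makes them easy to sanity-check. The sketch below assumes the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation, mean as sum over non-missing count) and that the zeros percentage is taken over all observations, including missing ones. The inputs are the rounded values printed in the report, so agreement is only approximate.

```python
# Values copied from the PhysicalHealthDays summary.
std = 8.6955881
mean = 4.3530456
total = 1_512_309
n_observations = 356_105
n_missing = 8_691
n_zeros = 214_245

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance from standard deviation
n_valid = n_observations - n_missing
implied_mean = total / n_valid       # mean recovered from the sum
zeros_pct = 100 * n_zeros / n_observations

print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.6f}")
print(f"Implied mean: {implied_mean:.7f}")
print(f"Zeros: {zeros_pct:.1f}%")
```

With a median of 0 and more than half the values at zero, the mean and standard deviation are driven largely by the cluster of responses at 30 days.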
Histogram with fixed size bins (bins=31)

Common values

Value | Count | Frequency (%)
0 | 214245 | 60.2%
30 | 26531 | 7.5%
2 | 20277 | 5.7%
1 | 13779 | 3.9%
3 | 12725 | 3.6%
5 | 12146 | 3.4%
10 | 8540 | 2.4%
7 | 7494 | 2.1%
15 | 6940 | 1.9%
4 | 6793 | 1.9%
Other values (21) | 17944 | 5.0%
(Missing) | 8691 | 2.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 214245 | 60.2%
1 | 13779 | 3.9%
2 | 20277 | 5.7%
3 | 12725 | 3.6%
4 | 6793 | 1.9%
5 | 12146 | 3.4%
6 | 2019 | 0.6%
7 | 7494 | 2.1%
8 | 1441 | 0.4%
9 | 333 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
30 | 26531 | 7.5%
29 | 287 | 0.1%
28 | 605 | 0.2%
27 | 157 | < 0.1%
26 | 92 | < 0.1%
25 | 1725 | 0.5%
24 | 103 | < 0.1%
23 | 70 | < 0.1%
22 | 115 | < 0.1%
21 | 830 | 0.2%

MentalHealthDays
Real number (ℝ)

MISSING  ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 7257
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3817938
Minimum: 0
Maximum: 30
Zeros: 212324
Zeros (%): 59.6%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 5
95-th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 8.38843
Coefficient of variation (CV): 1.9143827
Kurtosis: 3.3562411
Mean: 4.3817938
Median Absolute Deviation (MAD): 0
Skewness: 2.1228382
Sum: 1528580
Variance: 70.365758
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Common values

Value | Count | Frequency (%)
0 | 212324 | 59.6%
30 | 21562 | 6.1%
2 | 19015 | 5.3%
5 | 15940 | 4.5%
10 | 12267 | 3.4%
3 | 12233 | 3.4%
15 | 11627 | 3.3%
1 | 11474 | 3.2%
20 | 7328 | 2.1%
4 | 6358 | 1.8%
Other values (21) | 18720 | 5.3%
(Missing) | 7257 | 2.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 212324 | 59.6%
1 | 11474 | 3.2%
2 | 19015 | 5.3%
3 | 12233 | 3.4%
4 | 6358 | 1.8%
5 | 15940 | 4.5%
6 | 1879 | 0.5%
7 | 6292 | 1.8%
8 | 1366 | 0.4%
9 | 258 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
30 | 21562 | 6.1%
29 | 401 | 0.1%
28 | 740 | 0.2%
27 | 190 | 0.1%
26 | 87 | < 0.1%
25 | 2500 | 0.7%
24 | 105 | < 0.1%
23 | 72 | < 0.1%
22 | 153 | < 0.1%
21 | 450 | 0.1%

LastCheckupTime
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 6597
Missing (%): 1.9%
Memory size: 35.8 MiB
Within past year (anytime less than 12 months ago): 281044
Within past 2 years (1 year but less than 2 years ago): 33420
Within past 5 years (2 years but less than 5 years ago): 19780
5 or more years ago: 15264

Length

Max length: 55
Median length: 50
Mean length: 49.311592
Min length: 19

Characters and Unicode

Total characters: 17234796
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Within past 5 years (2 years but less than 5 years ago)
2nd row: Within past year (anytime less than 12 months ago)
3rd row: Within past year (anytime less than 12 months ago)
4th row: Within past 5 years (2 years but less than 5 years ago)
5th row: Within past year (anytime less than 12 months ago)

Common Values

Value | Count | Frequency (%)
Within past year (anytime less than 12 months ago) | 281044 | 78.9%
Within past 2 years (1 year but less than 2 years ago) | 33420 | 9.4%
Within past 5 years (2 years but less than 5 years ago) | 19780 | 5.6%
5 or more years ago | 15264 | 4.3%
(Missing) | 6597 | 1.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
ago | 349508 | 10.8%
within | 334244 | 10.3%
past | 334244 | 10.3%
less | 334244 | 10.3%
than | 334244 | 10.3%
year | 314464 | 9.7%
anytime | 281044 | 8.7%
12 | 281044 | 8.7%
months | 281044 | 8.7%
years | 141444 | 4.4%
Other values (6) | 258592 | 8.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2894608 | 16.8%
a | 1754948 | 10.2%
t | 1618020 | 9.4%
s | 1425220 | 8.3%
n | 1230576 | 7.1%
e | 1086460 | 6.3%
h | 949532 | 5.5%
i | 949532 | 5.5%
y | 736952 | 4.3%
o | 661080 | 3.8%
Other values (13) | 3927868 | 22.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 17234796 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 17234796 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 17234796 | 100.0%

Distinct: 2
Distinct (%): < 0.1%
Missing: 871
Missing (%): 0.2%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
True | 269962 | 75.8%
False | 85272 | 23.9%
(Missing) | 871 | 0.2%

SleepHours
Real number (ℝ)

MISSING 

Distinct: 24
Distinct (%): < 0.1%
Missing: 4349
Missing (%): 1.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0211169
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 6
median: 7
Q3: 8
95-th percentile: 9
Maximum: 24
Range: 23
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5031697
Coefficient of variation (CV): 0.21409267
Kurtosis: 8.747918
Mean: 7.0211169
Median Absolute Deviation (MAD): 1
Skewness: 0.76438391
Sum: 2469720
Variance: 2.2595191
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)

Common values

Value | Count | Frequency (%)
7 | 106273 | 29.8%
8 | 100285 | 28.2%
6 | 76795 | 21.6%
5 | 24118 | 6.8%
9 | 16922 | 4.8%
4 | 10064 | 2.8%
10 | 8358 | 2.3%
3 | 2589 | 0.7%
12 | 2400 | 0.7%
2 | 1234 | 0.3%
Other values (14) | 2718 | 0.8%
(Missing) | 4349 | 1.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 933 | 0.3%
2 | 1234 | 0.3%
3 | 2589 | 0.7%
4 | 10064 | 2.8%
5 | 24118 | 6.8%
6 | 76795 | 21.6%
7 | 106273 | 29.8%
8 | 100285 | 28.2%
9 | 16922 | 4.8%
10 | 8358 | 2.3%

Maximum 10 values

Value | Count | Frequency (%)
24 | 43 | < 0.1%
23 | 15 | < 0.1%
22 | 14 | < 0.1%
21 | 3 | < 0.1%
20 | 113 | < 0.1%
19 | 15 | < 0.1%
18 | 127 | < 0.1%
17 | 22 | < 0.1%
16 | 263 | 0.1%
15 | 266 | 0.1%

RemovedTeeth
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 9042
Missing (%): 2.5%
Memory size: 22.9 MiB
None of them: 186970
1 to 5: 103275
6 or more, but not all: 36515
All: 20303

Length

Max length: 22
Median length: 12
Mean length: 10.740209
Min length: 3

Characters and Unicode

Total characters: 3727529
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None of them
2nd row: 6 or more, but not all
3rd row: None of them
4th row: 6 or more, but not all
5th row: 1 to 5

Common Values

Value | Count | Frequency (%)
None of them | 186970 | 52.5%
1 to 5 | 103275 | 29.0%
6 or more, but not all | 36515 | 10.3%
All | 20303 | 5.7%
(Missing) | 9042 | 2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
none | 186970 | 16.8%
of | 186970 | 16.8%
them | 186970 | 16.8%
1 | 103275 | 9.3%
to | 103275 | 9.3%
5 | 103275 | 9.3%
all | 56818 | 5.1%
6 | 36515 | 3.3%
or | 36515 | 3.3%
more | 36515 | 3.3%
Other values (2) | 73030 | 6.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 763065 | 20.5%
o | 586760 | 15.7%
e | 410455 | 11.0%
t | 363275 | 9.7%
n | 223485 | 6.0%
m | 223485 | 6.0%
N | 186970 | 5.0%
f | 186970 | 5.0%
h | 186970 | 5.0%
l | 113636 | 3.0%
Other values (9) | 482458 | 12.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3727529 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3727529 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3727529 | 100.0%

HadHeartAttack
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 2473
Missing (%): 0.7%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 333575 | 93.7%
True | 20057 | 5.6%
(Missing) | 2473 | 0.7%

HadAngina
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 3527
Missing (%): 1.0%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 331415 | 93.1%
True | 21163 | 5.9%
(Missing) | 3527 | 1.0%

HadStroke
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1250
Missing (%): 0.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 339386 | 95.3%
True | 15469 | 4.3%
(Missing) | 1250 | 0.4%

HadAsthma
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 1407
Missing (%): 0.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 301289 | 84.6%
True | 53409 | 15.0%
(Missing) | 1407 | 0.4%

HadSkinCancer
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 2543
Missing (%): 0.7%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 325224 | 91.3%
True | 28338 | 8.0%
(Missing) | 2543 | 0.7%

HadCOPD
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1743
Missing (%): 0.5%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 325805 | 91.5%
True | 28557 | 8.0%
(Missing) | 1743 | 0.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 2220
Missing (%): 0.6%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 280826 | 78.9%
True | 73059 | 20.5%
(Missing) | 2220 | 0.6%

HadKidneyDisease
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1519
Missing (%): 0.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 338279 | 95.0%
True | 16307 | 4.6%
(Missing) | 1519 | 0.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 2087
Missing (%): 0.6%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 233132 | 65.5%
True | 120886 | 33.9%
(Missing) | 2087 | 0.6%

HadDiabetes
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 865
Missing (%): 0.2%
Memory size: 20.5 MiB
No: 294997
Yes: 48871
No, pre-diabetes or borderline diabetes: 8257
Yes, but only during pregnancy (female): 3115

Length

Max length: 39
Median length: 2
Mean length: 3.3220217
Min length: 2

Characters and Unicode

Total characters: 1180115
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Yes
3rd row: Yes
4th row: No
5th row: No

Common Values

Value | Count | Frequency (%)
No | 294997 | 82.8%
Yes | 48871 | 13.7%
No, pre-diabetes or borderline diabetes | 8257 | 2.3%
Yes, but only during pregnancy (female) | 3115 | 0.9%
(Missing) | 865 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 303254 | 75.1%
yes | 51986 | 12.9%
pre-diabetes | 8257 | 2.0%
or | 8257 | 2.0%
borderline | 8257 | 2.0%
diabetes | 8257 | 2.0%
but | 3115 | 0.8%
only | 3115 | 0.8%
during | 3115 | 0.8%
pregnancy | 3115 | 0.8%

Most occurring characters

Value | Count | Frequency (%)
o | 322883 | 27.4%
N | 303254 | 25.7%
e | 119130 | 10.1%
s | 68500 | 5.8%
Y | 51986 | 4.4%
(space) | 48603 | 4.1%
r | 39258 | 3.3%
d | 27886 | 2.4%
b | 27886 | 2.4%
i | 27886 | 2.4%
Other values (15) | 142843 | 12.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1180115 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1180115 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1180115 | 100.0%

DeafOrHardOfHearing
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 16402
Missing (%): 4.6%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 308525 | 86.6%
True | 31178 | 8.8%
(Missing) | 16402 | 4.6%

BlindOrVisionDifficulty
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 17143
Missing (%): 4.8%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 320099 | 89.9%
True | 18863 | 5.3%
(Missing) | 17143 | 4.8%

DifficultyConcentrating
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 19276
Missing (%): 5.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 296825 | 83.4%
True | 40004 | 11.2%
(Missing) | 19276 | 5.4%

DifficultyWalking
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 19070
Missing (%): 5.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 282576 | 79.4%
True | 54459 | 15.3%
(Missing) | 19070 | 5.4%

DifficultyDressingBathing
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 19012
Missing (%): 5.3%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 323648 | 90.9%
True | 13445 | 3.8%
(Missing) | 19012 | 5.3%

DifficultyErrands
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 20405
Missing (%): 5.7%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 309841 | 87.0%
True | 25859 | 7.3%
(Missing) | 20405 | 5.7%

SmokerStatus
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 28274
Missing (%): 7.9%
Memory size: 24.1 MiB
Never smoked: 197070
Former smoker: 90899
Current smoker - now smokes every day: 28771
Current smoker - now smokes some days: 11091

Length

Max length: 37
Median length: 12
Mean length: 15.317102
Min length: 12

Characters and Unicode

Total characters: 5021421
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Never smoked
2nd row: Never smoked
3rd row: Former smoker
4th row: Never smoked
5th row: Former smoker

Common Values

Value | Count | Frequency (%)
Never smoked | 197070 | 55.3%
Former smoker | 90899 | 25.5%
Current smoker - now smokes every day | 28771 | 8.1%
Current smoker - now smokes some days | 11091 | 3.1%
(Missing) | 28274 | 7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
never | 197070 | 23.0%
smoked | 197070 | 23.0%
smoker | 130761 | 15.3%
former | 90899 | 10.6%
current | 39862 | 4.7%
(blank) | 39862 | 4.7%
now | 39862 | 4.7%
smokes | 39862 | 4.7%
every | 28771 | 3.4%
day | 28771 | 3.4%
Other values (2) | 22182 | 2.6%

Most occurring characters

Value | Count | Frequency (%)
e | 961227 | 19.1%
r | 618124 | 12.3%
(space) | 527141 | 10.5%
o | 509545 | 10.1%
m | 469683 | 9.4%
s | 429737 | 8.6%
k | 367693 | 7.3%
d | 236932 | 4.7%
v | 225841 | 4.5%
N | 197070 | 3.9%
Other values (9) | 478428 | 9.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 5021421 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 5021421 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 5021421 | 100.0%

ECigaretteUsage
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 28412
Missing (%): 8.0%
Memory size: 30.7 MiB
Never used e-cigarettes in my entire life: 249651
Not at all (right now): 60334
Use them some days: 9440
Use them every day: 8268

Length

Max length: 41
Median length: 41
Mean length: 36.258886
Min length: 18

Characters and Unicode

Total characters: 11881783
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not at all (right now)
2nd row: Never used e-cigarettes in my entire life
3rd row: Never used e-cigarettes in my entire life
4th row: Never used e-cigarettes in my entire life
5th row: Not at all (right now)

Common Values

Value | Count | Frequency (%)
Never used e-cigarettes in my entire life | 249651 | 70.1%
Not at all (right now) | 60334 | 16.9%
Use them some days | 9440 | 2.7%
Use them every day | 8268 | 2.3%
(Missing) | 28412 | 8.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
never | 249651 | 11.8%
used | 249651 | 11.8%
e-cigarettes | 249651 | 11.8%
in | 249651 | 11.8%
my | 249651 | 11.8%
entire | 249651 | 11.8%
life | 249651 | 11.8%
now | 60334 | 2.8%
right | 60334 | 2.8%
all | 60334 | 2.8%
Other values (8) | 191500 | 9.0%

Most occurring characters

Value | Count | Frequency (%)
e | 2308251 | 19.4%
(space) | 1792366 | 15.1%
i | 1058938 | 8.9%
t | 947663 | 8.0%
r | 817555 | 6.9%
n | 559636 | 4.7%
s | 535890 | 4.5%
a | 388027 | 3.3%
l | 370319 | 3.1%
g | 309985 | 2.6%
Other values (15) | 2793153 | 23.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 11881783 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 11881783 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 11881783 | 100.0%

ChestScan
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 44791
Missing (%): 12.6%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 178411 | 50.1%
True | 132903 | 37.3%
(Missing) | 44791 | 12.6%

RaceEthnicityCategory
Categorical

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 11276
Missing (%): 3.2%
Memory size: 26.8 MiB

Value | Count
White only, Non-Hispanic | 256215
Hispanic | 34394
Black only, Non-Hispanic | 28368
Other race only, Non-Hispanic | 18214
Multiracial, Non-Hispanic | 7638

Length

Max length: 29
Median length: 24
Mean length: 22.690377
Min length: 8

Characters and Unicode

Total characters: 7824300
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
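
The length and character statistics above can be reproduced from the category labels and their counts alone. The following is a sketch under the assumption that lengths are measured in characters of the raw label and that missing values are excluded; the variable names are illustrative.

```python
# Category labels and counts as reported for RaceEthnicityCategory.
counts = {
    "White only, Non-Hispanic": 256215,
    "Hispanic": 34394,
    "Black only, Non-Hispanic": 28368,
    "Other race only, Non-Hispanic": 18214,
    "Multiracial, Non-Hispanic": 7638,
}

n_values = sum(counts.values())  # non-missing observations

# Total characters: label length weighted by how often the label occurs.
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_values

max_length = max(len(label) for label in counts)
min_length = min(len(label) for label in counts)

# Distinct characters across all labels (case-sensitive, spaces included).
distinct_characters = len(set("".join(counts)))

print(total_characters, round(mean_length, 6), max_length, min_length,
      distinct_characters)
```

Under these assumptions the sketch reproduces the reported total of 7824300 characters, the mean length of 22.690377, and the 24 distinct characters.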

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: White only, Non-Hispanic
2nd row: Black only, Non-Hispanic
3rd row: White only, Non-Hispanic
4th row: Black only, Non-Hispanic
5th row: Black only, Non-Hispanic

Common Values

Value | Count | Frequency (%)
White only, Non-Hispanic | 256215 | 71.9%
Hispanic | 34394 | 9.7%
Black only, Non-Hispanic | 28368 | 8.0%
Other race only, Non-Hispanic | 18214 | 5.1%
Multiracial, Non-Hispanic | 7638 | 2.1%
(Missing) | 11276 | 3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
non-hispanic | 310435 | 31.8%
only | 302797 | 31.0%
white | 256215 | 26.2%
hispanic | 34394 | 3.5%
black | 28368 | 2.9%
other | 18214 | 1.9%
race | 18214 | 1.9%
multiracial | 7638 | 0.8%
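
The word counts above are consistent with lowercasing each label, splitting on whitespace, and stripping commas, so that "non-hispanic" stays one token and is counted separately from "hispanic". The sketch below assumes that tokenization; the report's actual tokenizer is not shown in this section.

```python
from collections import Counter

# Category labels and counts as reported for RaceEthnicityCategory.
counts = {
    "White only, Non-Hispanic": 256215,
    "Hispanic": 34394,
    "Black only, Non-Hispanic": 28368,
    "Other race only, Non-Hispanic": 18214,
    "Multiracial, Non-Hispanic": 7638,
}

# Assumed tokenization: lowercase, split on whitespace, strip commas.
words = Counter()
for label, n in counts.items():
    for token in label.lower().split():
        words[token.strip(",")] += n

total_words = sum(words.values())
for word, n in words.most_common():
    print(f"{word}: {n} ({100 * n / total_words:.1f}%)")
```

Note that the percentages here are shares of all word occurrences, not shares of rows, which is why "white" shows 26.2% rather than 71.9%.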

Most occurring characters

Value | Count | Frequency (%)
i | 961149 | 12.3%
n | 958061 | 12.2%
(space) | 631446 | 8.1%
o | 613232 | 7.8%
a | 406687 | 5.2%
c | 399049 | 5.1%
l | 346441 | 4.4%
H | 344829 | 4.4%
s | 344829 | 4.4%
p | 344829 | 4.4%
Other values (14) | 2473748 | 31.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 7824300 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 7824300 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 7824300 | 100.0%

AgeCategory
Categorical

MISSING 

Distinct: 13
Distinct (%): < 0.1%
Missing: 7234
Missing (%): 2.0%
Memory size: 23.4 MiB

Value | Count
Age 65 to 69 | 37760
Age 60 to 64 | 35696
Age 70 to 74 | 34796
Age 55 to 59 | 29454
Age 80 or older | 28984
Other values (8) | 182181

Length

Max length: 15
Median length: 12
Mean length: 12.249238
Min length: 12

Characters and Unicode

Total characters: 4273404
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Age 45 to 49
2nd row: Age 75 to 79
3rd row: Age 75 to 79
4th row: Age 75 to 79
5th row: Age 70 to 74

Common Values

Value | Count | Frequency (%)
Age 65 to 69 | 37760 | 10.6%
Age 60 to 64 | 35696 | 10.0%
Age 70 to 74 | 34796 | 9.8%
Age 55 to 59 | 29454 | 8.3%
Age 80 or older | 28984 | 8.1%
Age 50 to 54 | 26758 | 7.5%
Age 75 to 79 | 26042 | 7.3%
Age 40 to 44 | 24019 | 6.7%
Age 35 to 39 | 22952 | 6.4%
Age 45 to 49 | 22755 | 6.4%
Other values (3) | 59655 | 16.8%
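
The "Other values (3)" row can be derived from the figures already shown: it is what remains of the non-missing observations after the ten listed categories are subtracted. A small sketch of that bookkeeping, with illustrative variable names:

```python
# Counts for the ten most frequent AgeCategory values listed above.
top_counts = [37760, 35696, 34796, 29454, 28984,
              26758, 26042, 24019, 22952, 22755]

n_observations = 356105  # rows in the dataset
n_missing = 7234         # missing AgeCategory values
n_distinct = 13          # distinct AgeCategory values

# Whatever is neither missing nor in the top ten falls into "Other values".
other_count = n_observations - n_missing - sum(top_counts)
other_distinct = n_distinct - len(top_counts)
other_pct = round(100 * other_count / n_observations, 1)

print(f"Other values ({other_distinct}): {other_count} ({other_pct}%)")
```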

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
age | 348871 | 25.0%
to | 319887 | 22.9%
65 | 37760 | 2.7%
69 | 37760 | 2.7%
60 | 35696 | 2.6%
64 | 35696 | 2.6%
70 | 34796 | 2.5%
74 | 34796 | 2.5%
55 | 29454 | 2.1%
59 | 29454 | 2.1%
Other values (19) | 451314 | 32.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1046613 | 24.5%
e | 377855 | 8.8%
o | 377855 | 8.8%
A | 348871 | 8.2%
g | 348871 | 8.2%
t | 319887 | 7.5%
5 | 268869 | 6.3%
4 | 256990 | 6.0%
0 | 170867 | 4.0%
9 | 156445 | 3.7%
Other values (9) | 600281 | 14.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4273404 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4273404 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4273404 | 100.0%

HeightInMeters
Real number (ℝ)

MISSING 

Distinct: 107
Distinct (%): < 0.1%
Missing: 22742
Missing (%): 6.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7025849
Minimum: 0.91
Maximum: 2.41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0.91
5-th percentile: 1.52
Q1: 1.63
median: 1.7
Q3: 1.78
95-th percentile: 1.88
Maximum: 2.41
Range: 1.5
Interquartile range (IQR): 0.15

Descriptive statistics

Standard deviation: 0.10705306
Coefficient of variation (CV): 0.06287678
Kurtosis: 0.17779989
Mean: 1.7025849
Median Absolute Deviation (MAD): 0.08
Skewness: 0.029398198
Sum: 567578.81
Variance: 0.011460357
Monotonicity: Not monotonic
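
Several of the statistics above are functions of one another, so they can be cross-checked without access to the raw data. The sketch below assumes the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation, sum as mean times the non-missing count); the tolerances are arbitrary choices to absorb rounding in the reported values.

```python
import math

# Statistics as reported for HeightInMeters.
mean = 1.7025849
std = 0.10705306
variance = 0.011460357
total = 567578.81        # reported as "Sum"
q1, q3 = 1.63, 1.78
minimum, maximum = 0.91, 2.41
n_observations = 356105  # rows in the dataset
n_missing = 22742        # missing HeightInMeters values

n_valid = n_observations - n_missing

# Each check compares a derived quantity with the value printed in the report.
checks = {
    "CV = std / mean": math.isclose(std / mean, 0.06287678, abs_tol=1e-7),
    "variance = std ** 2": math.isclose(std ** 2, variance, abs_tol=1e-8),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, 0.15, abs_tol=1e-9),
    "range = max - min": math.isclose(maximum - minimum, 1.5, abs_tol=1e-9),
    "sum = mean * non-missing count": math.isclose(mean * n_valid, total, abs_tol=0.05),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

The last check also confirms that the mean and sum are computed over the 333363 non-missing values rather than over all rows.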
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1.68 | 29528 | 8.3%
1.63 | 28455 | 8.0%
1.7 | 27326 | 7.7%
1.65 | 26257 | 7.4%
1.78 | 25750 | 7.2%
1.73 | 24755 | 7.0%
1.75 | 23239 | 6.5%
1.6 | 22692 | 6.4%
1.83 | 22600 | 6.3%
1.57 | 21626 | 6.1%
Other values (97) | 81135 | 22.8%
(Missing) | 22742 | 6.4%

Minimum 10 values

Value | Count | Frequency (%)
0.91 | 19 | < 0.1%
0.92 | 1 | < 0.1%
0.95 | 1 | < 0.1%
0.97 | 4 | < 0.1%
1 | 3 | < 0.1%
1.02 | 2 | < 0.1%
1.03 | 1 | < 0.1%
1.04 | 15 | < 0.1%
1.05 | 22 | < 0.1%
1.06 | 4 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2.41 | 4 | < 0.1%
2.34 | 4 | < 0.1%
2.29 | 4 | < 0.1%
2.26 | 10 | < 0.1%
2.24 | 1 | < 0.1%
2.21 | 7 | < 0.1%
2.18 | 6 | < 0.1%
2.16 | 7 | < 0.1%
2.13 | 22 | < 0.1%
2.11 | 21 | < 0.1%

WeightInKilograms
Real number (ℝ)

MISSING 

Distinct: 582
Distinct (%): 0.2%
Missing: 33567
Missing (%): 9.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.074472
Minimum: 22.68
Maximum: 292.57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 22.68
5-th percentile: 54.43
Q1: 68.04
median: 80.74
Q3: 95.25
95-th percentile: 122.47
Maximum: 292.57
Range: 269.89
Interquartile range (IQR): 27.21

Descriptive statistics

Standard deviation: 21.455636
Coefficient of variation (CV): 0.2582699
Kurtosis: 2.8102708
Mean: 83.074472
Median Absolute Deviation (MAD): 12.7
Skewness: 1.0827284
Sum: 26794674
Variance: 460.3443
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
90.72 | 17120 | 4.8%
81.65 | 15773 | 4.4%
68.04 | 14138 | 4.0%
72.57 | 13770 | 3.9%
77.11 | 12778 | 3.6%
86.18 | 11293 | 3.2%
63.5 | 10350 | 2.9%
79.38 | 9354 | 2.6%
74.84 | 8675 | 2.4%
99.79 | 8673 | 2.4%
Other values (572) | 200614 | 56.3%
(Missing) | 33567 | 9.4%

Minimum 10 values

Value | Count | Frequency (%)
22.68 | 8 | < 0.1%
23 | 1 | < 0.1%
23.13 | 1 | < 0.1%
23.59 | 2 | < 0.1%
24 | 1 | < 0.1%
24.04 | 3 | < 0.1%
24.49 | 2 | < 0.1%
24.95 | 3 | < 0.1%
25.4 | 1 | < 0.1%
25.85 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
292.57 | 1 | < 0.1%
290.3 | 2 | < 0.1%
285 | 1 | < 0.1%
281.68 | 1 | < 0.1%
281 | 1 | < 0.1%
280.32 | 1 | < 0.1%
280 | 1 | < 0.1%
278.96 | 1 | < 0.1%
276.24 | 1 | < 0.1%
274.42 | 1 | < 0.1%

BMI
Real number (ℝ)

MISSING 

Distinct: 3823
Distinct (%): 1.2%
Missing: 38866
Missing (%): 10.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.532016
Minimum: 12.02
Maximum: 99.64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 12.02
5-th percentile: 20.12
Q1: 24.13
median: 27.44
Q3: 31.75
95-th percentile: 40.69
Maximum: 99.64
Range: 87.62
Interquartile range (IQR): 7.62

Descriptive statistics

Standard deviation: 6.5547862
Coefficient of variation (CV): 0.22973443
Kurtosis: 4.3990018
Mean: 28.532016
Median Absolute Deviation (MAD): 3.73
Skewness: 1.3837698
Sum: 9051468.1
Variance: 42.965223
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
26.63 | 3382 | 0.9%
27.46 | 2624 | 0.7%
24.41 | 2543 | 0.7%
27.44 | 2541 | 0.7%
27.12 | 2492 | 0.7%
25.1 | 2177 | 0.6%
32.28 | 1925 | 0.5%
29.53 | 1871 | 0.5%
25.84 | 1853 | 0.5%
29.29 | 1853 | 0.5%
Other values (3813) | 293978 | 82.6%
(Missing) | 38866 | 10.9%

Minimum 10 values

Value | Count | Frequency (%)
12.02 | 1 | < 0.1%
12.05 | 1 | < 0.1%
12.06 | 1 | < 0.1%
12.11 | 3 | < 0.1%
12.16 | 5 | < 0.1%
12.19 | 1 | < 0.1%
12.2 | 1 | < 0.1%
12.21 | 3 | < 0.1%
12.24 | 1 | < 0.1%
12.27 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
99.64 | 1 | < 0.1%
97.65 | 3 | < 0.1%
95.66 | 1 | < 0.1%
94.66 | 1 | < 0.1%
93.88 | 2 | < 0.1%
93.51 | 1 | < 0.1%
93.41 | 1 | < 0.1%
92.22 | 1 | < 0.1%
92.01 | 1 | < 0.1%
91.72 | 1 | < 0.1%

AlcoholDrinkers
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 37137
Missing (%): 10.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
True | 168859 | 47.4%
False | 150109 | 42.2%
(Missing) | 37137 | 10.4%

HIVTesting
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 52878
Missing (%): 14.8%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 199785 | 56.1%
True | 103442 | 29.0%
(Missing) | 52878 | 14.8%

FluVaxLast12
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 37622
Missing (%): 10.6%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
True | 167532 | 47.0%
False | 150951 | 42.4%
(Missing) | 37622 | 10.6%

PneumoVaxEver
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 61588
Missing (%): 17.3%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 172504 | 48.4%
True | 122013 | 34.3%
(Missing) | 61588 | 17.3%

TetanusLast10Tdap
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 65972
Missing (%): 18.5%
Memory size: 31.0 MiB

Value | Count
No, did not receive any tetanus shot in the past 10 years | 97238
Yes, received tetanus shot but not sure what type | 90876
Yes, received Tdap | 80055
Yes, received tetanus shot, but not Tdap | 21964

Length

Max length: 57
Median length: 49
Mean length: 42.446188
Min length: 18

Characters and Unicode

Total characters: 12315040
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yes, received Tdap
2nd row: Yes, received tetanus shot but not sure what type
3rd row: No, did not receive any tetanus shot in the past 10 years
4th row: No, did not receive any tetanus shot in the past 10 years
5th row: Yes, received tetanus shot but not sure what type

Common Values

Value | Count | Frequency (%)
No, did not receive any tetanus shot in the past 10 years | 97238 | 27.3%
Yes, received tetanus shot but not sure what type | 90876 | 25.5%
Yes, received Tdap | 80055 | 22.5%
Yes, received tetanus shot, but not Tdap | 21964 | 6.2%
(Missing) | 65972 | 18.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
not | 210078 | 8.8%
tetanus | 210078 | 8.8%
shot | 210078 | 8.8%
received | 192895 | 8.1%
yes | 192895 | 8.1%
but | 112840 | 4.7%
tdap | 102019 | 4.3%
10 | 97238 | 4.1%
years | 97238 | 4.1%
no | 97238 | 4.1%
Other values (9) | 856056 | 36.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2088520 | 17.0%
e | 1649600 | 13.4%
t | 1329380 | 10.8%
s | 898403 | 7.3%
a | 694687 | 5.6%
n | 614632 | 5.0%
o | 517394 | 4.2%
d | 489390 | 4.0%
i | 484609 | 3.9%
r | 478247 | 3.9%
Other values (14) | 3070178 | 24.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 12315040 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 12315040 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 12315040 | 100.0%

HighRiskLastYear
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 40437
Missing (%): 11.4%
Memory size: 695.6 KiB

Value | Count | Frequency (%)
False | 301926 | 84.8%
True | 13742 | 3.9%
(Missing) | 40437 | 11.4%

CovidPos
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 40556
Missing (%): 11.4%
Memory size: 20.6 MiB

Value | Count
No | 216088
Yes | 88712
Tested positive using home test without a health professional | 10749

Length

Max length: 61
Median length: 2
Mean length: 4.2909374
Min length: 2

Characters and Unicode

Total characters: 1354001
Distinct characters: 22
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: Tested positive using home test without a health professional
4th row: No
5th row: Yes

Common Values

Value | Count | Frequency (%)
No | 216088 | 60.7%
Yes | 88712 | 24.9%
Tested positive using home test without a health professional | 10749 | 3.0%
(Missing) | 40556 | 11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
no | 216088 | 53.8%
yes | 88712 | 22.1%
tested | 10749 | 2.7%
positive | 10749 | 2.7%
using | 10749 | 2.7%
home | 10749 | 2.7%
test | 10749 | 2.7%
without | 10749 | 2.7%
a | 10749 | 2.7%
health | 10749 | 2.7%

Most occurring characters

Value | Count | Frequency (%)
o | 269833 | 19.9%
N | 216088 | 16.0%
e | 163955 | 12.1%
s | 153206 | 11.3%
Y | 88712 | 6.6%
(space) | 85992 | 6.4%
t | 75243 | 5.6%
i | 53745 | 4.0%
h | 42996 | 3.2%
a | 32247 | 2.4%
Other values (12) | 171984 | 12.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1354001 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1354001 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1354001 | 100.0%

Interactions

[Figures: 36 interaction plots]

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
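
Nullity correlation can be illustrated by turning each column into a 0/1 "is missing" indicator and correlating the indicators. The sketch below uses a hand-written Pearson correlation and a small hypothetical table (the rows are invented for illustration, not taken from the dataset); with pandas, `df.isna().corr()` should give the same kind of matrix.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy rows; None marks a missing value.
rows = [
    {"HeightInMeters": 1.60, "WeightInKilograms": 51.71, "SleepHours": 7.0},
    {"HeightInMeters": None, "WeightInKilograms": None, "SleepHours": 8.0},
    {"HeightInMeters": 1.70, "WeightInKilograms": 88.45, "SleepHours": None},
    {"HeightInMeters": None, "WeightInKilograms": None, "SleepHours": 9.0},
    {"HeightInMeters": 1.68, "WeightInKilograms": 69.40, "SleepHours": 7.0},
]


def nullity(column):
    """Return 1 where the column is missing and 0 where it is present."""
    return [1 if row[column] is None else 0 for row in rows]


height_weight = pearson(nullity("HeightInMeters"), nullity("WeightInKilograms"))
height_sleep = pearson(nullity("HeightInMeters"), nullity("SleepHours"))

print(f"height vs weight nullity correlation: {height_weight:.2f}")
print(f"height vs sleep nullity correlation: {height_sleep:.2f}")
```

In this toy table height and weight are always missing together, so their nullity correlation is 1, while sleep is missing only where height is present, which gives a negative value.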

Sample

(index) | State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos
0 | Utah | Female | Very good | 0.0 | 0.0 | Within past 5 years (2 years but less than 5 years ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Not at all (right now) | No | White only, Non-Hispanic | Age 45 to 49 | 1.60 | 51.71 | 20.19 | Yes | Yes | Yes | No | Yes, received Tdap | No | No
1 | District of Columbia | Female | Fair | 4.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | 6 or more, but not all | No | No | No | Yes | No | No | No | Yes | Yes | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | NaN | Black only, Non-Hispanic | Age 75 to 79 | 1.60 | 54.43 | 21.26 | No | No | No | Yes | NaN | No | No
2 | Washington | Male | Good | 1.0 | 1.0 | Within past year (anytime less than 12 months ago) | Yes | 9.0 | None of them | No | No | Yes | NaN | Yes | No | No | No | Yes | Yes | No | No | No | Yes | No | No | Former smoker | Never used e-cigarettes in my entire life | NaN | White only, Non-Hispanic | Age 75 to 79 | 1.70 | 88.45 | 30.54 | Yes | Yes | Yes | Yes | Yes, received tetanus shot but not sure what type | No | Tested positive using home test without a health professional
3 | Wisconsin | Female | Good | 0.0 | 0.0 | Within past 5 years (2 years but less than 5 years ago) | No | 9.0 | 6 or more, but not all | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | Black only, Non-Hispanic | Age 75 to 79 | 1.68 | 69.40 | 24.69 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No
4 | Kansas | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | 1 to 5 | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | Former smoker | Not at all (right now) | No | Black only, Non-Hispanic | Age 70 to 74 | 1.75 | 108.86 | 35.44 | No | No | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | Yes
5 | Maine | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | 6 or more, but not all | No | No | No | No | No | No | No | No | Yes | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | Hispanic | Age 80 or older | 1.68 | 54.43 | 19.37 | No | Yes | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No
6 | Massachusetts | Female | Fair | 3.0 | 30.0 | Within past year (anytime less than 12 months ago) | No | 15.0 | 1 to 5 | No | No | No | Yes | No | No | Yes | No | Yes | No | No | No | Yes | No | No | No | Current smoker - now smokes every day | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 50 to 54 | 1.65 | 72.57 | 26.63 | No | Yes | Yes | No | Yes, received Tdap | Yes | No
7 | District of Columbia | Male | Good | 10.0 | 10.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 30 to 34 | 1.70 | 86.18 | 29.76 | Yes | Yes | No | No | Yes, received tetanus shot but not sure what type | No | Yes
8 | California | Male | Very good | 1.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 25 to 29 | 1.68 | 81.65 | 29.05 | No | No | No | Yes | Yes, received tetanus shot but not sure what type | No | No
9 | New Jersey | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 9.0 | None of them | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 60 to 64 | 1.63 | 47.63 | 18.02 | Yes | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No

(index) | State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos
356095 | Ohio | Female | Fair | 30.0 | 5.0 | Within past year (anytime less than 12 months ago) | Yes | 5.0 | 6 or more, but not all | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | No | NaN | NaN | NaN | White only, Non-Hispanic | Age 60 to 64 | 1.65 | 77.11 | 28.29 | NaN | NaN | NaN | NaN | NaN | NaN | NaN
356096 | North Dakota | Female | Good | 0.0 | 5.0 | Within past year (anytime less than 12 months ago) | No | 5.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 50 to 54 | 1.70 | 82.55 | 28.50 | No | Yes | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No
356097 | Mississippi | Female | Fair | 5.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | 1 to 5 | No | NaN | No | Yes | No | Yes | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 45 to 49 | 1.70 | NaN | NaN | No | No | No | NaN | NaN | No | No
356098 | Washington | Male | Fair | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | Yes | No | No | No | Yes | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 80 or older | 1.70 | 74.84 | 25.84 | No | No | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No
356099 | Ohio | Female | Very good | 0.0 | 0.0 | Within past 2 years (1 year but less than 2 years ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | Former smoker | NaN | Yes | White only, Non-Hispanic | Age 50 to 54 | 1.57 | 99.79 | 40.24 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No
356100 | Texas | Female | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | 6 or more, but not all | No | No | No | No | No | No | No | No | Yes | No | No | No | No | Yes | No | No | Former smoker | Never used e-cigarettes in my entire life | No | Hispanic | Age 65 to 69 | 1.57 | 77.11 | 31.09 | No | No | Yes | No | No, did not receive any tetanus shot in the past 10 years | No | No
356101 | Virginia | Female | Fair | 3.0 | 5.0 | Within past year (anytime less than 12 months ago) | Yes | 11.0 | 1 to 5 | Yes | No | Yes | No | No | No | Yes | No | Yes | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 80 or older | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
356102 | Indiana | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 6.0 | 6 or more, but not all | No | No | Yes | No | No | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 80 or older | 1.52 | 63.96 | 27.54 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No
356103 | Indiana | Female | Excellent | 30.0 | 30.0 | Within past year (anytime less than 12 months ago) | No | 4.0 | None of them | No | No | No | No | Yes | No | Yes | No | Yes | No | No | No | No | Yes | No | No | Current smoker - now smokes every day | Not at all (right now) | Yes | White only, Non-Hispanic | Age 45 to 49 | 1.73 | 84.37 | 28.28 | Yes | Yes | No | No | Yes, received Tdap | No | No
356104 | Nebraska | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 7.0 | 6 or more, but not all | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 80 or older | 1.52 | 56.70 | 24.41 | Yes | No | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No

Duplicate rows

Most frequently occurring

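(index) | State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos | # duplicates
9 | Connecticut | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 55 to 59 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4
24 | Louisiana | Female | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4
6 | Colorado | Male | Good | 0.0 | 0.0 | NaN | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Hispanic | Age 18 to 24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3
12 | Florida | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 65 to 69 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3
53 | New York | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 50 to 54 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3
54 | New York | Male | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Hispanic | Age 40 to 44 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3
72 | Texas | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Hispanic | Age 18 to 24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3
0 | Alaska | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 65 to 69 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2
1 | Arizona | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | Yes | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 75 to 79 | 1.63 | 56.7 | 21.46 | Yes | No | Yes | Yes | Yes, received Tdap | No | No | 2
2 | California | Female | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2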
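
A table like the one above can be built by counting identical rows and keeping those that occur more than once. The sketch below uses a small hypothetical set of rows (three columns only, invented for illustration) and treats missing cells as equal to each other, which is an assumption about how the report matches rows; with pandas, `DataFrame.duplicated` offers comparable behaviour.

```python
from collections import Counter

# Hypothetical toy rows (State, Sex, GeneralHealth); None marks a missing value.
rows = [
    ("Louisiana", "Female", None),
    ("Utah", "Female", "Very good"),
    ("Louisiana", "Female", None),
    ("California", "Female", None),
    ("Louisiana", "Female", None),
    ("California", "Female", None),
]

# Count identical rows. None compares equal to None, so rows whose
# missing cells line up are treated as duplicates of each other.
row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in row_counts.most_common() if n > 1]

# Surplus copies: every occurrence after the first one.
n_surplus = sum(n - 1 for _, n in duplicates)

for row, n in duplicates:
    print(row, n)
print("surplus copies:", n_surplus)
```

Most of the duplicated rows in the table above are sparse, with many NaN cells, which is consistent with rows matching mainly because their missing cells coincide. Whether the report's total of 90 counts surplus copies or all occurrences is not stated here.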